Adapting a Parser to Clinical Text by Simple Pre-processing Rules
نویسنده
چکیده
Sentence types typical to Swedish clinical text were extracted by comparing sentence part-of-speech tag sequences in clinical and in standard Swedish text. Parsings by a syntactic dependency parser, trained on standard Swedish, were manually analysed for the 33 sentence types most typical to clinical text. This analysis resulted in the identification of eight error types, and for two of these error types, preprocessing rules were constructed to improve the performance of the parser. For all but one of the ten sentence types affected by these two rules, the parsing was improved by pre-processing.
منابع مشابه
Adapting a general parser to a sublanguage
In this paper, we propose a method to adapt a general parser (Link Parser) to sublanguages, focusing on the parsing of texts in biology. Our main proposal is the use of terminology (identi cation and analysis of terms) in order to reduce the complexity of the text to be parsed. Several other strategies are explored and nally combined among which text normalization, lexicon and morpho-guessing m...
متن کاملA Parser-based Text Preprocessor for Romanian Language Tts Synthesis
Text preprocessing plays an important role in a textto-speech (TTS) synthesis system. The correct detection and interpretation of input strings influence the overall system accuracy and contribute to the conversion of an unrestricted text into synthetic speech. This paper describes the design philosophy of a preprocessing module for a TTS system in Romanian language. The preprocessor is impleme...
متن کاملStudying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
متن کاملروش جدید متنکاوی برای استخراج اطلاعات زمینه کاربر بهمنظور بهبود رتبهبندی نتایج موتور جستجو
Today, the importance of text processing and its usages is well known among researchers and students. The amount of textual, documental materials increase day by day. So we need useful ways to save them and retrieve information from these materials. For example, search engines such as Google, Yahoo, Bing and etc. need to read so many web documents and retrieve the most similar ones to the user ...
متن کاملA parser-based text preprocessor for romanian language TTS synthesis
Text preprocessing plays an important role in a textto-speech (TTS) synthesis system. The correct detection and interpretation of input strings influence the overall system accuracy and contribute to the conversion of an unrestricted text into synthetic speech. This paper describes the design philosophy of a preprocessing module for a TTS system in Romanian language. The preprocessor is impleme...
متن کامل